Overview

Dataset statistics

Number of variables: 14
Number of observations: 45000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.0 MiB
Average record size in memory: 348.8 B

Variable types

Numeric: 8
Categorical: 5
Boolean: 1

Alerts

High correlation: cb_person_cred_hist_length is highly correlated overall with person_age and 1 other field
High correlation: loan_amnt is highly correlated overall with loan_percent_income
High correlation: loan_percent_income is highly correlated overall with loan_amnt
High correlation: loan_status is highly correlated overall with previous_loan_defaults_on_file
High correlation: person_age is highly correlated overall with cb_person_cred_hist_length and 1 other field
High correlation: person_emp_exp is highly correlated overall with cb_person_cred_hist_length and 1 other field
High correlation: previous_loan_defaults_on_file is highly correlated overall with loan_status
Skewed: person_income is highly skewed (γ1 = 34.13758313)
Zeros: person_emp_exp has 9566 (21.3%) zeros
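
The Skewed and Zeros alerts can be checked against the figures reported for each variable. The following is a minimal sketch, assuming a skewness cutoff of 20 and a zero-share cutoff of 10%. Both cutoffs are illustrative values chosen here, not settings confirmed by this report or its config.json.

```python
# Illustrative thresholds: assumptions, not values read from the report's configuration.
SKEWNESS_THRESHOLD = 20.0
ZEROS_SHARE_THRESHOLD = 0.10

N_OBSERVATIONS = 45000  # "Number of observations" from the dataset statistics


def is_skewed(skewness: float, threshold: float = SKEWNESS_THRESHOLD) -> bool:
    """Flag a variable whose absolute skewness exceeds the threshold."""
    return abs(skewness) > threshold


def zero_share(n_zeros: int, n_rows: int = N_OBSERVATIONS) -> float:
    """Fraction of rows whose value is exactly zero."""
    return n_zeros / n_rows


income_skewed = is_skewed(34.13758313)  # person_income skewness from the alert
age_skewed = is_skewed(2.548154)        # person_age skewness, for comparison
emp_exp_zero_share = zero_share(9566)   # person_emp_exp zero count from the alert
emp_exp_zeros_alert = emp_exp_zero_share > ZEROS_SHARE_THRESHOLD

print(income_skewed, age_skewed, f"{emp_exp_zero_share:.1%}", emp_exp_zeros_alert)
```

Under these assumed cutoffs only person_income is flagged as skewed, and the zero share of person_emp_exp reproduces the 21.3% shown in the alert.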

Reproduction

Analysis started: 2024-12-19 20:07:50.162912
Analysis finished: 2024-12-19 20:07:53.235966
Duration: 3.07 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
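
The reported duration follows directly from the two timestamps. A small sketch of that arithmetic, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction section
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-12-19 20:07:50.162912", FMT)
finished = datetime.strptime("2024-12-19 20:07:53.235966", FMT)

# Elapsed wall-clock time between start and finish, in seconds
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```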

Variables

person_age
Real number (ℝ)

High correlation 

Distinct: 60
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.764178
Minimum: 20
Maximum: 144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 20
5-th percentile: 22
Q1: 24
Median: 26
Q3: 30
95-th percentile: 39
Maximum: 144
Range: 124
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.0451082
Coefficient of variation (CV): 0.2177305
Kurtosis: 18.649449
Mean: 27.764178
Median Absolute Deviation (MAD): 3
Skewness: 2.548154
Sum: 1249388
Variance: 36.543333
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
23 | 5254 | 11.7%
24 | 5138 | 11.4%
25 | 4507 | 10.0%
22 | 4236 | 9.4%
26 | 3659 | 8.1%
27 | 3095 | 6.9%
28 | 2728 | 6.1%
29 | 2455 | 5.5%
30 | 2021 | 4.5%
31 | 1645 | 3.7%
Other values (50) | 10262 | 22.8%

Minimum 10 values

Value | Count | Frequency (%)
20 | 17 | < 0.1%
21 | 1289 | 2.9%
22 | 4236 | 9.4%
23 | 5254 | 11.7%
24 | 5138 | 11.4%
25 | 4507 | 10.0%
26 | 3659 | 8.1%
27 | 3095 | 6.9%
28 | 2728 | 6.1%
29 | 2455 | 5.5%

Maximum 10 values

Value | Count | Frequency (%)
144 | 3 | < 0.1%
123 | 2 | < 0.1%
116 | 1 | < 0.1%
109 | 1 | < 0.1%
94 | 1 | < 0.1%
84 | 1 | < 0.1%
80 | 1 | < 0.1%
78 | 1 | < 0.1%
76 | 1 | < 0.1%
73 | 3 | < 0.1%
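
The largest person_age values (up to 144) sit far from the bulk of the distribution. The sketch below compares two ways of flagging them, using only figures quoted in this section: the conventional 1.5 × IQR fences, and a domain cap of 100 years. The cap and the helper name `iqr_fences` are assumptions made for illustration.

```python
def iqr_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) Tukey fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


Q1, Q3 = 24, 30          # quartiles reported for person_age
MAX_PLAUSIBLE_AGE = 100  # hypothetical domain cap, not part of the report

# The ten largest distinct ages and their counts, as listed in the report
largest_ages = {144: 3, 123: 2, 116: 1, 109: 1, 94: 1,
                84: 1, 80: 1, 78: 1, 76: 1, 73: 3}

lower_fence, upper_fence = iqr_fences(Q1, Q3)

# Listed ages beyond the upper IQR fence
beyond_fence = {age: n for age, n in largest_ages.items() if age > upper_fence}

# Rows whose age exceeds the hypothetical plausibility cap
implausible_rows = sum(n for age, n in largest_ages.items() if age > MAX_PLAUSIBLE_AGE)

print(lower_fence, upper_fence, len(beyond_fence), implausible_rows)
```

The upper fence coincides with the 95-th percentile (39), so the IQR rule would flag a sizeable share of ordinary records, whereas the cap isolates only the handful of rows with ages above 100.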

person_gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

male: 24841
female: 20159

Length

Max length: 6
Median length: 4
Mean length: 4.8959556
Min length: 4

Characters and Unicode

Total characters: 220318
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: female
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value | Count | Frequency (%)
male | 24841 | 55.2%
female | 20159 | 44.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
male | 24841 | 55.2%
female | 20159 | 44.8%

Most occurring characters

Value | Count | Frequency (%)
e | 65159 | 29.6%
m | 45000 | 20.4%
a | 45000 | 20.4%
l | 45000 | 20.4%
f | 20159 | 9.1%

Most occurring categories, scripts, and blocks

All 220318 characters (100.0%) are reported under a single (unknown) category, a single (unknown) script, and a single (unknown) block. The most frequent characters per category, per script, and per block are therefore identical to the table above.

person_education
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Bachelor: 13399
Associate: 12028
High School: 11972
Master: 6980
Doctorate: 621

Length

Max length: 11
Median length: 9
Mean length: 8.769
Min length: 6

Characters and Unicode

Total characters: 394605
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Master
2nd row: High School
3rd row: High School
4th row: Bachelor
5th row: Master

Common Values

Value | Count | Frequency (%)
Bachelor | 13399 | 29.8%
Associate | 12028 | 26.7%
High School | 11972 | 26.6%
Master | 6980 | 15.5%
Doctorate | 621 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
bachelor | 13399 | 23.5%
associate | 12028 | 21.1%
high | 11972 | 21.0%
school | 11972 | 21.0%
master | 6980 | 12.3%
doctorate | 621 | 1.1%

Most occurring characters

Value | Count | Frequency (%)
o | 50613 | 12.8%
c | 38020 | 9.6%
h | 37343 | 9.5%
e | 33028 | 8.4%
a | 33028 | 8.4%
s | 31036 | 7.9%
l | 25371 | 6.4%
i | 24000 | 6.1%
r | 21000 | 5.3%
t | 20250 | 5.1%
Other values (8) | 80916 | 20.5%

Most occurring categories, scripts, and blocks

All 394605 characters (100.0%) are reported under a single (unknown) category, a single (unknown) script, and a single (unknown) block. The most frequent characters per category, per script, and per block are therefore identical to the table above.

person_income
Real number (ℝ)

Skewed 

Distinct: 33989
Distinct (%): 75.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80319.053
Minimum: 8000
Maximum: 7200766
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 8000
5-th percentile: 28366.7
Q1: 47204
Median: 67048
Q3: 95789.25
95-th percentile: 166754.7
Maximum: 7200766
Range: 7192766
Interquartile range (IQR): 48585.25

Descriptive statistics

Standard deviation: 80422.499
Coefficient of variation (CV): 1.0012879
Kurtosis: 2398.6848
Mean: 80319.053
Median Absolute Deviation (MAD): 23124
Skewness: 34.137583
Sum: 3.6143574 × 10^9
Variance: 6.4677783 × 10^9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
8000 | 15 | < 0.1%
73011 | 10 | < 0.1%
36995 | 9 | < 0.1%
60914 | 8 | < 0.1%
37020 | 8 | < 0.1%
73082 | 7 | < 0.1%
60864 | 7 | < 0.1%
67131 | 7 | < 0.1%
72951 | 7 | < 0.1%
73040 | 7 | < 0.1%
Other values (33979) | 44915 | 99.8%

Minimum 10 values

Value | Count | Frequency (%)
8000 | 15 | < 0.1%
8037 | 1 | < 0.1%
8104 | 1 | < 0.1%
8186 | 1 | < 0.1%
8248 | 1 | < 0.1%
8267 | 1 | < 0.1%
8277 | 1 | < 0.1%
8302 | 1 | < 0.1%
8518 | 1 | < 0.1%
9364 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
7200766 | 1 | < 0.1%
5556399 | 1 | < 0.1%
5545545 | 1 | < 0.1%
2448661 | 1 | < 0.1%
2280980 | 1 | < 0.1%
2139143 | 1 | < 0.1%
2012954 | 1 | < 0.1%
1741243 | 1 | < 0.1%
1728974 | 1 | < 0.1%
1661567 | 1 | < 0.1%
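
The skewness of person_income is driven by a few very large values. A common way to tame such a tail is a log transform. The sketch below illustrates the effect on the quantiles and extremes quoted in this section, not on the full 45000-row column, so its numbers will not match the reported skewness of 34.137583. The `skewness` helper uses the population (moment-based) formula, which is an assumption; the report may use a bias-corrected estimator.

```python
import math


def skewness(values):
    """Population (moment-based) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Minimum, 5-th percentile, Q1, median, Q3, 95-th percentile, maximum from the report
incomes = [8000, 28366.7, 47204, 67048, 95789.25, 166754.7, 7200766]

# Natural log compresses the right tail (all incomes are strictly positive)
log_incomes = [math.log(v) for v in incomes]

raw_skew = skewness(incomes)
log_skew = skewness(log_incomes)

print(f"raw skewness: {raw_skew:.2f}, log skewness: {log_skew:.2f}")
```

On this small illustrative set the log-transformed values remain right-skewed but noticeably less so than the raw values.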

person_emp_exp
Real number (ℝ)

High correlation  Zeros 

Distinct: 63
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4103333
Minimum: 0
Maximum: 125
Zeros: 9566
Zeros (%): 21.3%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 4
Q3: 8
95-th percentile: 17
Maximum: 125
Range: 125
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.0635321
Coefficient of variation (CV): 1.1207317
Kurtosis: 19.168324
Mean: 5.4103333
Median Absolute Deviation (MAD): 3
Skewness: 2.5949174
Sum: 243465
Variance: 36.766421
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 9566 | 21.3%
2 | 4134 | 9.2%
1 | 4061 | 9.0%
3 | 3890 | 8.6%
4 | 3524 | 7.8%
5 | 3000 | 6.7%
6 | 2717 | 6.0%
7 | 2204 | 4.9%
8 | 1890 | 4.2%
9 | 1575 | 3.5%
Other values (53) | 8439 | 18.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 9566 | 21.3%
1 | 4061 | 9.0%
2 | 4134 | 9.2%
3 | 3890 | 8.6%
4 | 3524 | 7.8%
5 | 3000 | 6.7%
6 | 2717 | 6.0%
7 | 2204 | 4.9%
8 | 1890 | 4.2%
9 | 1575 | 3.5%

Maximum 10 values

Value | Count | Frequency (%)
125 | 1 | < 0.1%
124 | 1 | < 0.1%
121 | 1 | < 0.1%
101 | 1 | < 0.1%
100 | 1 | < 0.1%
93 | 1 | < 0.1%
85 | 1 | < 0.1%
76 | 1 | < 0.1%
62 | 1 | < 0.1%
61 | 1 | < 0.1%

person_home_ownership
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB

RENT: 23443
MORTGAGE: 18489
OWN: 2951
OTHER: 117

Length

Max length: 8
Median length: 4
Mean length: 5.5804889
Min length: 3

Characters and Unicode

Total characters: 251122
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RENT
2nd row: OWN
3rd row: MORTGAGE
4th row: RENT
5th row: RENT

Common Values

Value | Count | Frequency (%)
RENT | 23443 | 52.1%
MORTGAGE | 18489 | 41.1%
OWN | 2951 | 6.6%
OTHER | 117 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
rent | 23443 | 52.1%
mortgage | 18489 | 41.1%
own | 2951 | 6.6%
other | 117 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
R | 42049 | 16.7%
E | 42049 | 16.7%
T | 42049 | 16.7%
G | 36978 | 14.7%
N | 26394 | 10.5%
O | 21557 | 8.6%
M | 18489 | 7.4%
A | 18489 | 7.4%
W | 2951 | 1.2%
H | 117 | < 0.1%

Most occurring categories, scripts, and blocks

All 251122 characters (100.0%) are reported under a single (unknown) category, a single (unknown) script, and a single (unknown) block. The most frequent characters per category, per script, and per block are therefore identical to the table above.

loan_amnt
Real number (ℝ)

High correlation 

Distinct: 4483
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9583.1576
Minimum: 500
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 500
5-th percentile: 2000
Q1: 5000
Median: 8000
Q3: 12237.25
95-th percentile: 24000
Maximum: 35000
Range: 34500
Interquartile range (IQR): 7237.25

Descriptive statistics

Standard deviation: 6314.8867
Coefficient of variation (CV): 0.65895678
Kurtosis: 1.3512152
Mean: 9583.1576
Median Absolute Deviation (MAD): 3800
Skewness: 1.1797313
Sum: 4.3124209 × 10^8
Variance: 39877794
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
10000 | 3617 | 8.0%
5000 | 2787 | 6.2%
6000 | 2426 | 5.4%
12000 | 2416 | 5.4%
15000 | 2004 | 4.5%
8000 | 1928 | 4.3%
4000 | 1406 | 3.1%
20000 | 1385 | 3.1%
3000 | 1378 | 3.1%
7000 | 1314 | 2.9%
Other values (4473) | 24339 | 54.1%

Minimum 10 values

Value | Count | Frequency (%)
500 | 5 | < 0.1%
563 | 1 | < 0.1%
700 | 1 | < 0.1%
725 | 1 | < 0.1%
750 | 1 | < 0.1%
800 | 1 | < 0.1%
900 | 2 | < 0.1%
912 | 1 | < 0.1%
922 | 1 | < 0.1%
950 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
35000 | 234 | 0.5%
34826 | 1 | < 0.1%
34800 | 1 | < 0.1%
34664 | 1 | < 0.1%
34375 | 1 | < 0.1%
34322 | 1 | < 0.1%
34121 | 1 | < 0.1%
34000 | 4 | < 0.1%
33950 | 2 | < 0.1%
33800 | 1 | < 0.1%

loan_intent
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

EDUCATION: 9153
MEDICAL: 8548
VENTURE: 7819
PERSONAL: 7552
DEBTCONSOLIDATION: 7145

Length

Max length: 17
Median length: 15
Mean length: 10.012711
Min length: 7

Characters and Unicode

Total characters: 450572
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PERSONAL
2nd row: EDUCATION
3rd row: MEDICAL
4th row: MEDICAL
5th row: MEDICAL

Common Values

Value | Count | Frequency (%)
EDUCATION | 9153 | 20.3%
MEDICAL | 8548 | 19.0%
VENTURE | 7819 | 17.4%
PERSONAL | 7552 | 16.8%
DEBTCONSOLIDATION | 7145 | 15.9%
HOMEIMPROVEMENT | 4783 | 10.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
education | 9153 | 20.3%
medical | 8548 | 19.0%
venture | 7819 | 17.4%
personal | 7552 | 16.8%
debtconsolidation | 7145 | 15.9%
homeimprovement | 4783 | 10.6%

Most occurring characters

Value | Count | Frequency (%)
E | 62385 | 13.8%
O | 47706 | 10.6%
N | 43597 | 9.7%
I | 36774 | 8.2%
T | 36045 | 8.0%
A | 32398 | 7.2%
D | 31991 | 7.1%
C | 24846 | 5.5%
L | 23245 | 5.2%
M | 22897 | 5.1%
Other values (7) | 88688 | 19.7%

Most occurring categories, scripts, and blocks

All 450572 characters (100.0%) are reported under a single (unknown) category, a single (unknown) script, and a single (unknown) block. The most frequent characters per category, per script, and per block are therefore identical to the table above.

loan_int_rate
Real number (ℝ)

Distinct: 1302
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.006606
Minimum: 5.42
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 5.42
5-th percentile: 6.17
Q1: 8.59
Median: 11.01
Q3: 12.99
95-th percentile: 16
Maximum: 20
Range: 14.58
Interquartile range (IQR): 4.4

Descriptive statistics

Standard deviation: 2.9788083
Coefficient of variation (CV): 0.27063823
Kurtosis: -0.42033531
Mean: 11.006606
Median Absolute Deviation (MAD): 2.13
Skewness: 0.21378407
Sum: 495297.26
Variance: 8.8732988
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
11.01 | 3329 | 7.4%
10.99 | 804 | 1.8%
7.51 | 798 | 1.8%
7.49 | 687 | 1.5%
7.88 | 673 | 1.5%
5.42 | 608 | 1.4%
7.9 | 606 | 1.3%
11.49 | 514 | 1.1%
9.99 | 484 | 1.1%
13.49 | 475 | 1.1%
Other values (1292) | 36022 | 80.0%

Minimum 10 values

Value | Count | Frequency (%)
5.42 | 608 | 1.4%
5.43 | 2 | < 0.1%
5.44 | 2 | < 0.1%
5.46 | 1 | < 0.1%
5.47 | 5 | < 0.1%
5.48 | 4 | < 0.1%
5.49 | 4 | < 0.1%
5.5 | 1 | < 0.1%
5.51 | 3 | < 0.1%
5.52 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20 | 84 | 0.2%
19.91 | 9 | < 0.1%
19.9 | 1 | < 0.1%
19.82 | 5 | < 0.1%
19.8 | 1 | < 0.1%
19.79 | 4 | < 0.1%
19.74 | 4 | < 0.1%
19.69 | 12 | < 0.1%
19.66 | 3 | < 0.1%
19.62 | 1 | < 0.1%

loan_percent_income
Real number (ℝ)

High correlation 

Distinct: 64
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.13972489
Minimum: 0
Maximum: 0.66
Zeros: 27
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.03
Q1: 0.07
Median: 0.12
Q3: 0.19
95-th percentile: 0.31
Maximum: 0.66
Range: 0.66
Interquartile range (IQR): 0.12

Descriptive statistics

Standard deviation: 0.087212308
Coefficient of variation (CV): 0.6241716
Kurtosis: 1.0824162
Mean: 0.13972489
Median Absolute Deviation (MAD): 0.05
Skewness: 1.0345122
Sum: 6287.62
Variance: 0.0076059867
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0.08 | 2593 | 5.8%
0.1 | 2421 | 5.4%
0.07 | 2415 | 5.4%
0.09 | 2295 | 5.1%
0.06 | 2242 | 5.0%
0.12 | 2216 | 4.9%
0.05 | 2176 | 4.8%
0.11 | 2158 | 4.8%
0.14 | 1960 | 4.4%
0.04 | 1950 | 4.3%
Other values (54) | 22574 | 50.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 27 | 0.1%
0.01 | 315 | 0.7%
0.02 | 944 | 2.1%
0.03 | 1488 | 3.3%
0.04 | 1950 | 4.3%
0.05 | 2176 | 4.8%
0.06 | 2242 | 5.0%
0.07 | 2415 | 5.4%
0.08 | 2593 | 5.8%
0.09 | 2295 | 5.1%

Maximum 10 values

Value | Count | Frequency (%)
0.66 | 1 | < 0.1%
0.63 | 1 | < 0.1%
0.62 | 2 | < 0.1%
0.61 | 2 | < 0.1%
0.59 | 1 | < 0.1%
0.58 | 1 | < 0.1%
0.57 | 1 | < 0.1%
0.56 | 5 | < 0.1%
0.55 | 5 | < 0.1%
0.54 | 8 | < 0.1%
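
The high correlation between loan_percent_income and loan_amnt is unsurprising if the former is derived from the latter. The sketch below tests that reading on the first five sample rows of this report, assuming loan_percent_income is loan_amnt divided by person_income, rounded to two decimals. This is an inference from the sample values, not a documented definition, and `percent_income` is a helper name introduced here.

```python
# (person_income, loan_amnt, reported loan_percent_income) for sample rows 0 to 4
rows = [
    (71948.0, 35000.0, 0.49),
    (12282.0, 1000.0, 0.08),
    (12438.0, 5500.0, 0.44),
    (79753.0, 35000.0, 0.44),
    (66135.0, 35000.0, 0.53),
]


def percent_income(loan_amnt: float, person_income: float) -> float:
    """Loan amount as a fraction of income, rounded to two decimals."""
    return round(loan_amnt / person_income, 2)


recomputed = [percent_income(loan, income) for income, loan, _ in rows]

# Compare each recomputed ratio with the value shown in the sample
matches = [calc == reported for calc, (_, _, reported) in zip(recomputed, rows)]

print(recomputed, all(matches))
```

If the relationship holds across the whole dataset, one of the two columns is largely redundant given person_income, which is worth keeping in mind before feeding both to a model.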

cb_person_cred_hist_length
Real number (ℝ)

High correlation 

Distinct: 29
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8674889
Minimum: 2
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 3
Median: 4
Q3: 8
95-th percentile: 14
Maximum: 30
Range: 28
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.8797018
Coefficient of variation (CV): 0.66122014
Kurtosis: 3.7259445
Mean: 5.8674889
Median Absolute Deviation (MAD): 2
Skewness: 1.63172
Sum: 264037
Variance: 15.052086
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=29)

Common values

Value | Count | Frequency (%)
4 | 8653 | 19.2%
3 | 8312 | 18.5%
2 | 6537 | 14.5%
5 | 3082 | 6.8%
6 | 2966 | 6.6%
7 | 2889 | 6.4%
8 | 2800 | 6.2%
9 | 2685 | 6.0%
10 | 2457 | 5.5%
12 | 715 | 1.6%
Other values (19) | 3904 | 8.7%

Minimum 10 values

Value | Count | Frequency (%)
2 | 6537 | 14.5%
3 | 8312 | 18.5%
4 | 8653 | 19.2%
5 | 3082 | 6.8%
6 | 2966 | 6.6%
7 | 2889 | 6.4%
8 | 2800 | 6.2%
9 | 2685 | 6.0%
10 | 2457 | 5.5%
11 | 712 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
30 | 23 | 0.1%
29 | 15 | < 0.1%
28 | 29 | 0.1%
27 | 23 | 0.1%
26 | 20 | < 0.1%
25 | 23 | 0.1%
24 | 34 | 0.1%
23 | 26 | 0.1%
22 | 32 | 0.1%
21 | 24 | 0.1%

credit_score
Real number (ℝ)

Distinct: 340
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 632.60876
Minimum: 390
Maximum: 850
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 351.7 KiB

Quantile statistics

Minimum: 390
5-th percentile: 539
Q1: 601
Median: 640
Q3: 670
95-th percentile: 703
Maximum: 850
Range: 460
Interquartile range (IQR): 69

Descriptive statistics

Standard deviation: 50.435865
Coefficient of variation (CV): 0.079726789
Kurtosis: 0.20302186
Mean: 632.60876
Median Absolute Deviation (MAD): 33
Skewness: -0.61026083
Sum: 28467394
Variance: 2543.7765
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
658 | 406 | 0.9%
649 | 398 | 0.9%
652 | 396 | 0.9%
663 | 394 | 0.9%
647 | 393 | 0.9%
650 | 391 | 0.9%
654 | 391 | 0.9%
667 | 390 | 0.9%
653 | 390 | 0.9%
656 | 386 | 0.9%
Other values (330) | 41065 | 91.3%

Minimum 10 values

Value | Count | Frequency (%)
390 | 1 | < 0.1%
418 | 1 | < 0.1%
419 | 1 | < 0.1%
420 | 1 | < 0.1%
421 | 1 | < 0.1%
430 | 1 | < 0.1%
431 | 2 | < 0.1%
434 | 1 | < 0.1%
435 | 4 | < 0.1%
437 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
850 | 1 | < 0.1%
807 | 1 | < 0.1%
805 | 1 | < 0.1%
792 | 1 | < 0.1%
789 | 1 | < 0.1%
784 | 2 | < 0.1%
773 | 1 | < 0.1%
772 | 1 | < 0.1%
770 | 1 | < 0.1%
768 | 1 | < 0.1%

previous_loan_defaults_on_file
Boolean

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.1 KiB

True: 22858
False: 22142

Value | Count | Frequency (%)
True | 22858 | 50.8%
False | 22142 | 49.2%

loan_status
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

0: 35000
1: 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 35000 | 77.8%
1 | 10000 | 22.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
0 | 35000 | 77.8%
1 | 10000 | 22.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 35000 | 77.8%
1 | 10000 | 22.2%

Most occurring categories, scripts, and blocks

All 45000 characters (100.0%) are reported under a single (unknown) category, a single (unknown) script, and a single (unknown) block. The most frequent characters per category, per script, and per block are therefore identical to the table above.
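
The loan_status counts show a 77.8% / 22.2% split, which matters if the column is used as a classification target. The sketch below derives the class shares and one common choice of per-class weights, total / (number of classes × class count). The weighting scheme is an assumption for illustration and is not something this report prescribes.

```python
# Class counts for loan_status as reported above
counts = {0: 35000, 1: 10000}

total = sum(counts.values())
n_classes = len(counts)

# Share of each class among all observations
shares = {label: n / total for label, n in counts.items()}

# "Balanced" weights: rarer classes receive proportionally larger weights
weights = {label: total / (n_classes * n) for label, n in counts.items()}

for label in counts:
    print(label, f"{shares[label]:.1%}", round(weights[label], 4))
```

With these counts, class 1 receives a weight of 2.25 and class 0 a weight of roughly 0.64, so that each class contributes equally in aggregate.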

Correlations

variable | cb_person_cred_hist_length | credit_score | loan_amnt | loan_int_rate | loan_intent | loan_percent_income | loan_status | person_age | person_education | person_emp_exp | person_gender | person_home_ownership | person_income | previous_loan_defaults_on_file
cb_person_cred_hist_length | 1.000 | 0.142 | 0.043 | 0.017 | 0.054 | -0.037 | 0.020 | 0.821 | 0.091 | 0.750 | 0.026 | 0.028 | 0.093 | 0.026
credit_score | 0.142 | 1.000 | 0.006 | 0.011 | 0.016 | -0.012 | 0.008 | 0.160 | 0.129 | 0.172 | 0.005 | 0.000 | 0.023 | 0.178
loan_amnt | 0.043 | 0.006 | 1.000 | 0.105 | 0.030 | 0.666 | 0.126 | 0.064 | 0.000 | 0.052 | 0.005 | 0.090 | 0.405 | 0.066
loan_int_rate | 0.017 | 0.011 | 0.105 | 1.000 | 0.017 | 0.124 | 0.363 | 0.013 | 0.004 | 0.016 | 0.000 | 0.084 | -0.033 | 0.198
loan_intent | 0.054 | 0.016 | 0.030 | 0.017 | 1.000 | 0.018 | 0.142 | 0.030 | 0.012 | 0.029 | 0.000 | 0.082 | 0.010 | 0.080
loan_percent_income | -0.037 | -0.012 | 0.666 | 0.124 | 0.018 | 1.000 | 0.415 | -0.056 | 0.000 | -0.050 | 0.000 | 0.091 | -0.353 | 0.220
loan_status | 0.020 | 0.008 | 0.126 | 0.363 | 0.142 | 0.415 | 1.000 | 0.012 | 0.000 | 0.014 | 0.000 | 0.258 | 0.009 | 0.543
person_age | 0.821 | 0.160 | 0.064 | 0.013 | 0.030 | -0.056 | 0.012 | 1.000 | 0.060 | 0.888 | 0.024 | 0.015 | 0.143 | 0.030
person_education | 0.091 | 0.129 | 0.000 | 0.004 | 0.012 | 0.000 | 0.000 | 0.060 | 1.000 | 0.065 | 0.000 | 0.006 | 0.004 | 0.040
person_emp_exp | 0.750 | 0.172 | 0.052 | 0.016 | 0.029 | -0.050 | 0.014 | 0.888 | 0.065 | 1.000 | 0.021 | 0.009 | 0.120 | 0.028
person_gender | 0.026 | 0.005 | 0.005 | 0.000 | 0.000 | 0.000 | 0.000 | 0.024 | 0.000 | 0.021 | 1.000 | 0.000 | 0.009 | 0.000
person_home_ownership | 0.028 | 0.000 | 0.090 | 0.084 | 0.082 | 0.091 | 0.258 | 0.015 | 0.006 | 0.009 | 0.000 | 1.000 | 0.008 | 0.140
person_income | 0.093 | 0.023 | 0.405 | -0.033 | 0.010 | -0.353 | 0.009 | 0.143 | 0.004 | 0.120 | 0.009 | 0.008 | 1.000 | 0.008
previous_loan_defaults_on_file | 0.026 | 0.178 | 0.066 | 0.198 | 0.080 | 0.220 | 0.543 | 0.030 | 0.040 | 0.028 | 0.000 | 0.140 | 0.008 | 1.000
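
The High correlation alerts in the overview line up with the largest off-diagonal entries of the matrix above. The sketch below filters the strongest pairs with an assumed cutoff of 0.5 on the absolute value. The cutoff is a guess that happens to reproduce the alerted pairs, not a setting confirmed by this report.

```python
THRESHOLD = 0.5  # assumed cutoff, not read from the report's configuration

# Off-diagonal entries of the matrix with absolute value of at least 0.35
pairs = {
    ("person_age", "person_emp_exp"): 0.888,
    ("cb_person_cred_hist_length", "person_age"): 0.821,
    ("cb_person_cred_hist_length", "person_emp_exp"): 0.750,
    ("loan_amnt", "loan_percent_income"): 0.666,
    ("loan_status", "previous_loan_defaults_on_file"): 0.543,
    ("loan_percent_income", "loan_status"): 0.415,
    ("loan_amnt", "person_income"): 0.405,
    ("loan_int_rate", "loan_status"): 0.363,
    ("loan_percent_income", "person_income"): -0.353,
}

# Keep pairs at or above the cutoff, strongest first
high = sorted(
    ((a, b, v) for (a, b), v in pairs.items() if abs(v) >= THRESHOLD),
    key=lambda item: -abs(item[2]),
)

for a, b, v in high:
    print(f"{a} ~ {b}: {v:.3f}")
```

Under this cutoff five pairs remain, covering exactly the seven variables named in the High correlation alerts.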

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

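First rows

index | person_age | person_gender | person_education | person_income | person_emp_exp | person_home_ownership | loan_amnt | loan_intent | loan_int_rate | loan_percent_income | cb_person_cred_hist_length | credit_score | previous_loan_defaults_on_file | loan_status
0 | 22.0 | female | Master | 71948.0 | 0 | RENT | 35000.0 | PERSONAL | 16.02 | 0.49 | 3.0 | 561 | No | 1
1 | 21.0 | female | High School | 12282.0 | 0 | OWN | 1000.0 | EDUCATION | 11.14 | 0.08 | 2.0 | 504 | Yes | 0
2 | 25.0 | female | High School | 12438.0 | 3 | MORTGAGE | 5500.0 | MEDICAL | 12.87 | 0.44 | 3.0 | 635 | No | 1
3 | 23.0 | female | Bachelor | 79753.0 | 0 | RENT | 35000.0 | MEDICAL | 15.23 | 0.44 | 2.0 | 675 | No | 1
4 | 24.0 | male | Master | 66135.0 | 1 | RENT | 35000.0 | MEDICAL | 14.27 | 0.53 | 4.0 | 586 | No | 1
5 | 21.0 | female | High School | 12951.0 | 0 | OWN | 2500.0 | VENTURE | 7.14 | 0.19 | 2.0 | 532 | No | 1
6 | 26.0 | female | Bachelor | 93471.0 | 1 | RENT | 35000.0 | EDUCATION | 12.42 | 0.37 | 3.0 | 701 | No | 1
7 | 24.0 | female | High School | 95550.0 | 5 | RENT | 35000.0 | MEDICAL | 11.11 | 0.37 | 4.0 | 585 | No | 1
8 | 24.0 | female | Associate | 100684.0 | 3 | RENT | 35000.0 | PERSONAL | 8.90 | 0.35 | 2.0 | 544 | No | 1
9 | 21.0 | female | High School | 12739.0 | 0 | OWN | 1600.0 | VENTURE | 14.74 | 0.13 | 3.0 | 640 | No | 1

Last rows

index | person_age | person_gender | person_education | person_income | person_emp_exp | person_home_ownership | loan_amnt | loan_intent | loan_int_rate | loan_percent_income | cb_person_cred_hist_length | credit_score | previous_loan_defaults_on_file | loan_status
44990 | 31.0 | male | Master | 136832.0 | 9 | RENT | 12319.0 | PERSONAL | 16.92 | 0.09 | 7.0 | 722 | No | 1
44991 | 24.0 | male | High School | 37786.0 | 0 | MORTGAGE | 13500.0 | EDUCATION | 13.43 | 0.36 | 4.0 | 612 | No | 1
44992 | 23.0 | female | Bachelor | 40925.0 | 0 | RENT | 9000.0 | PERSONAL | 11.01 | 0.22 | 4.0 | 487 | No | 1
44993 | 27.0 | female | High School | 35512.0 | 4 | RENT | 5000.0 | PERSONAL | 15.83 | 0.14 | 5.0 | 505 | No | 1
44994 | 24.0 | female | Associate | 31924.0 | 2 | RENT | 12229.0 | MEDICAL | 10.70 | 0.38 | 4.0 | 678 | No | 1
44995 | 27.0 | male | Associate | 47971.0 | 6 | RENT | 15000.0 | MEDICAL | 15.66 | 0.31 | 3.0 | 645 | No | 1
44996 | 37.0 | female | Associate | 65800.0 | 17 | RENT | 9000.0 | HOMEIMPROVEMENT | 14.07 | 0.14 | 11.0 | 621 | No | 1
44997 | 33.0 | male | Associate | 56942.0 | 7 | RENT | 2771.0 | DEBTCONSOLIDATION | 10.02 | 0.05 | 10.0 | 668 | No | 1
44998 | 29.0 | male | Bachelor | 33164.0 | 4 | RENT | 12000.0 | EDUCATION | 13.23 | 0.36 | 6.0 | 604 | No | 1
44999 | 24.0 | male | High School | 51609.0 | 1 | RENT | 6665.0 | DEBTCONSOLIDATION | 17.05 | 0.13 | 3.0 | 628 | No | 1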